Abstract: Let $S_n$ be a sum of independent centered random variables satisfying Bernstein's
condition with parameter $\varepsilon$, and let $\sigma^2$ be the variance of $S_n$. Bennett's inequality states that, for
any $x \ge 0$, $\mathbf{P}(S_n > x\sigma) \le \exp\{-\hat{x}^2/2\}$, where $\hat{x} = \frac{2x}{1+\sqrt{1+2x\varepsilon/\sigma}}$. We give several inequalities
which improve this inequality to optimal order, in the spirit of Talagrand's refinement of
Hoeffding's inequality. In particular, we sharpen this inequality by adding a missing factor
$F(x)$ with exponential decay rate. The interesting feature of our bound is that it recovers
closely the shape of the standard normal tail $1 - \Phi(x)$ for all $x \ge 0$, in contrast to Bennett's
bound, which does not share this property. Also, compared with the classical Cramér large
deviations, our inequality has the advantage that it is valid for all $x \ge 0$.






AMS 2000 subject classifications: Primary 60G50, 60F10; secondary 60F05. 
Keywords and phrases: Sums of independent random variables, large deviations, exponential
inequalities, asymptotic expansions, Bennett's inequality, tail probabilities.

1. Introduction 



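Let $\xi_1, \ldots, \xi_n$ be a sequence of independent centered random variables satisfying Bernstein's
condition
\[
|\mathbf{E}\xi_i^k| \le \frac{1}{2}\, k!\, \varepsilon^{k-2}\, \mathbf{E}\xi_i^2, \quad \text{for } k \ge 3 \text{ and } i = 1, \ldots, n, \tag{1.1}
\]
for some constant $\varepsilon > 0$. Denote
\[
S_n = \sum_{i=1}^n \xi_i \quad \text{and} \quad \sigma^2 = \sum_{i=1}^n \mathbf{E}\xi_i^2. \tag{1.2}
\]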

Starting from the seminal papers of Cramér [5] and Bernstein [4], the estimation of the tail
probabilities $\mathbf{P}(S_n > x)$, for large $x > 0$, has attracted much attention. By employing the exponential
Markov inequality and an upper bound for the moment generating function $\mathbf{E}e^{\lambda\xi_i}$, Bennett [1]
obtained the following inequality (cf. (8a) of [1]): for all $x \ge 0$,
\[
\mathbf{P}(S_n > x\sigma) \le B\Big(x, \frac{\varepsilon}{\sigma}\Big) := \exp\Big\{-\frac{\hat{x}^2}{2}\Big\}, \quad \text{where } \hat{x} = \frac{2x}{1+\sqrt{1+2x\varepsilon/\sigma}}. \tag{1.3}
\]
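As a rough numerical illustration of (1.3), the following sketch evaluates Bennett's bound next to the standard normal tail. The function names and the shorthand `r` for $\varepsilon/\sigma$ are choices made here for illustration only; they are not notation fixed by the paper at this point.

```python
import math


def x_hat(x: float, r: float) -> float:
    """x_hat = 2x / (1 + sqrt(1 + 2*x*r)), where r stands for epsilon/sigma."""
    return 2.0 * x / (1.0 + math.sqrt(1.0 + 2.0 * x * r))


def bennett_bound(x: float, r: float) -> float:
    """Right-hand side of (1.3): exp(-x_hat^2 / 2)."""
    return math.exp(-0.5 * x_hat(x, r) ** 2)


def normal_tail(x: float) -> float:
    """Standard normal tail 1 - Phi(x), computed through erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


if __name__ == "__main__":
    # Print Bennett's bound and the normal tail side by side for a small r.
    for x in (0.5, 1.0, 2.0, 4.0):
        print(x, bennett_bound(x, 0.001), normal_tail(x))
```

For small `r` the value `x_hat` is close to `x`, so the bound behaves like $e^{-x^2/2}$, which stays well above $1 - \Phi(x)$; this gap is what the missing factor discussed in the paper is meant to close.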



Various generalizations of inequality (1.3) and related results (including Bernstein's inequality)
can be found in Hoeffding [15], Statulevičius [29], Nagaev [18], Petrov [20], Talagrand [31], van de
Geer [32], De La Peña [9], Lesigne and Volný [16], Dedecker and Doukhan [6], Dedecker and Prieur
[7, 8], Doukhan and Neumann [10], van de Geer and Lederer [33], Merlevède, Peligrad and Rio
[17], Rio [26, 27] and Fan, Grama and Liu [12].

Cramér's large deviation result (cf. (2.2) of Section 2) suggests that Bennett's inequality (1.3) can
be substantially refined by adding a missing factor of order $\frac{1}{1+x}$. In the case where the summands
$\xi_i$ are assumed to be bounded, $\xi_i \le 1$, results of this type have been obtained by Eaton [11], Pinelis
[22, 25] and Bentkus [2, 3]. In these papers, the generalized moment comparison method is used to
obtain bounds of the type
\[
\mathbf{P}(S_n > x) \le c\, \mathbf{P}^{LC}(Y > x),
\]
where $c$ is an absolute constant, $\mathbf{P}^{LC}(Y > x)$ is the log-concave majorant of the tail function
$\mathbf{P}(Y > x)$ and $Y$ is a specially constructed random variable. For instance, in Bentkus [3], $c = \frac{e^2}{2}$
and $Y = \eta_1 + \ldots + \eta_n$, where the $\eta_i$'s are i.i.d. Bernoulli random variables such that
\[
\mathbf{P}(\eta_i = 1) = \frac{\sigma^2/n}{1+\sigma^2/n} \quad \text{and} \quad \mathbf{P}(\eta_i = -\sigma^2/n) = \frac{1}{1+\sigma^2/n}. \tag{1.4}
\]
These bounds restore the "missing" factor of order $\frac{1}{1+x}$ for large $x$ and bounded $\xi_i$'s. However, for
small $x$, by the central limit theorem, these bounds are not sharp, as the constant $c$ therein
exceeds 1. Using the conjugate measure technique of Cramér, Talagrand (cf. (1.6) of [30]) showed
that if the variables $\xi_i$ satisfy $-b \le \xi_i \le 1$ for some constant $b \ge 1$, then there exists a universal
constant $K$ such that, for all $0 \le x \le \frac{\sigma}{Kb}$,
\begin{align}
\mathbf{P}(S_n > x\sigma) &\le \inf_{\lambda \ge 0} \mathbf{E}e^{\lambda(S_n - x\sigma)} \Big( M(x) + K\frac{b}{\sigma} \Big) \tag{1.5} \\
&\le H_n(x, \sigma) \Big( M(x) + K\frac{b}{\sigma} \Big), \tag{1.6}
\end{align}
where $H_n(x, \sigma)$ denotes Hoeffding's bound (cf. (2.8) of [15]) and $\sqrt{2\pi}\, M(x)$ is Mill's ratio:
\[
M(x) = (1 - \Phi(x)) \exp\Big\{\frac{x^2}{2}\Big\} \quad \text{with} \quad \Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\, dt. \tag{1.7}
\]
Since $M(x) = O\big(\frac{1}{1+x}\big)$, (1.6) improves on Hoeffding's bound $H_n(x, \sigma)$ by adding
a factor of order $\frac{1}{1+x}$ for $0 \le x \le \frac{\sigma}{Kb}$. Moreover, for small $x$, Talagrand's bound (1.6) recovers
closely the shape of the standard normal tail $1 - \Phi(x)$ when $\frac{b}{\sigma} \to 0$.
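The quantity $M(x)$ of (1.7) is easy to evaluate numerically, which may help when comparing the bounds discussed here. The sketch below is only an illustration: the function name `mills_m` is a label chosen here, and the two-sided check uses the bound that the paper quotes from Feller as (2.17).

```python
import math


def mills_m(x: float) -> float:
    """M(x) = (1 - Phi(x)) * exp(x^2 / 2), as in (1.7).

    Uses erfc for the normal tail; exp(x^2/2) overflows for very large x
    (roughly x > 37 in double precision), so this is meant for moderate x.
    """
    return 0.5 * math.erfc(x / math.sqrt(2.0)) * math.exp(0.5 * x * x)


def feller_bounds(t: float) -> tuple:
    """Lower and upper bounds 1/(sqrt(2*pi)*(1+t)) and 1/(sqrt(pi)*(1+t))."""
    lower = 1.0 / (math.sqrt(2.0 * math.pi) * (1.0 + t))
    upper = 1.0 / (math.sqrt(math.pi) * (1.0 + t))
    return lower, upper


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 2.0, 5.0, 10.0):
        lo, hi = feller_bounds(t)
        print(t, lo, mills_m(t), hi)
```

The printed rows show $M(t)$ sitting between the two bounds, which is the sense in which $M(x)$ is of order $\frac{1}{1+x}$.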

The scope of this paper is to give several improvements of the Bennett inequality (1.3), in
particular, by adding a missing factor in the spirit of Talagrand's inequalities (1.5) and (1.6) under
the less restrictive condition (1.1). In addition to the fact that the $\xi_i$ are not assumed to be bounded,
our bounds will be valid for any $x \ge 0$, unlike the bound (1.6), which holds true only in the range
$0 \le x \le \frac{\sigma}{Kb}$. Our results will also imply Talagrand's bound (1.5) under (1.1).

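Let us illustrate briefly the results of the paper by using our Theorem 2.3: under Bernstein's
condition (1.1), for all $x \ge 0$,
\[
\mathbf{P}(S_n > x\sigma) \le B_n\Big(x, \frac{\varepsilon}{\sigma}\Big)\, F_2\Big(x, \frac{\varepsilon}{\sigma}\Big), \tag{1.8}
\]
where $F_2(x, \frac{\varepsilon}{\sigma}) \le 1$, $F_2(x, \frac{\varepsilon}{\sigma}) = O\big(\frac{1}{1+x}\big)$ for $x = o(\frac{\sigma}{\varepsilon})$ and
\[
B_n\Big(x, \frac{\varepsilon}{\sigma}\Big) = B\Big(x, \frac{\varepsilon}{\sigma}\Big) \exp\Big\{ -n\psi\Big( \frac{\hat{x}^2}{2n\sqrt{1+2x\varepsilon/\sigma}} \Big) \Big\}, \tag{1.9}
\]
with $\hat{x}$ as in (1.3) and $\psi(t) = t - \log(1+t)$, a nonnegative convex function in $t \ge 0$. The bound in (1.8) improves
Bennett's bound $B(x, \frac{\varepsilon}{\sigma})$ by the missing factor
\[
F_2\Big(x, \frac{\varepsilon}{\sigma}\Big) \exp\Big\{ -n\psi\Big( \frac{\hat{x}^2}{2n\sqrt{1+2x\varepsilon/\sigma}} \Big) \Big\}.
\]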
The comparison between $B_n(x, \frac{\varepsilon}{\sigma})\, F_2(x, \frac{\varepsilon}{\sigma})$ and Bennett's bound (1.3) is displayed in Figure 3.
In particular, since $F_2(x, \frac{\varepsilon}{\sigma}) \le 1$, in the i.i.d. case with $\mathbf{E}\xi_1^2 = 1$, from (1.8) we get
\[
\mathbf{P}(S_n > nt) \le B_n\Big(\sqrt{n}\, t, \frac{\varepsilon}{\sqrt{n}}\Big) = B\Big(\sqrt{n}\, t, \frac{\varepsilon}{\sqrt{n}}\Big) \exp\{-c_{t,\varepsilon}\, n\}, \quad t > 0, \tag{1.10}
\]
where $c_{t,\varepsilon} > 0$ is a constant depending only on $t$ and $\varepsilon$. The bound (1.10) improves Bennett's
bound (1.3) and Theorem 9 in Pinelis and Utev [21] (see also Theorem 3.3 in Pinelis [23]; in the
indicated theorems of [21] and [23] the bounds should be corrected as in [24]).

In Theorem 2.5, we obtain the following one-term expansion, which improves inequality (1.8):
for all $0 \le x \le 0.1\frac{\sigma}{\varepsilon}$,
\[
\mathbf{P}(S_n > x\sigma) = \inf_{\lambda \ge 0} \mathbf{E}e^{\lambda(S_n - x\sigma)} \Big( M(x) + 88.41\, \theta\, \frac{\varepsilon}{\sigma} \Big), \tag{1.11}
\]
where $|\theta| \le 1$. Note that equality (1.11) also improves Talagrand's inequality (1.5). Moreover,
under Bernstein's condition (1.1), we have $\inf_{\lambda \ge 0} \mathbf{E}e^{\lambda(S_n - x\sigma)} \le B_n(x, \frac{\varepsilon}{\sigma})$. If the $\xi_i$ are bounded, $\xi_i \le 1$,
it holds that $\inf_{\lambda \ge 0} \mathbf{E}e^{\lambda(S_n - x\sigma)} \le H_n(x, \sigma)$. The last inequality is sharp and attains equality when
(1.4) holds.

Our approach uses the conjugate distribution technique due to Cramér, which is different from
the methods used in Bennett [1], Eaton [11], Pinelis [22, 25] and Bentkus [2, 3]. We refine the
technique based on a change of probability measure and derive sharp bounds for the cumulant
function to obtain precise upper bounds for tail probabilities under Bernstein's condition.

The paper is organized as follows. In Section 2, we present our main results. In Section 3, we 
state some auxiliary results to be used in the proofs of theorems. Sections 4, 5, 6 are devoted to 
the proofs of main results. 



2. Main Results 



Throughout the paper, $\xi_1, \ldots, \xi_n$ is a sequence of independent real random variables with $\mathbf{E}\xi_i = 0$
and satisfying Bernstein's condition (1.1); $S_n$ and $\sigma^2$ are defined by (1.2). We use the notations
$a \wedge b = \min\{a, b\}$, $a \vee b = \max\{a, b\}$ and $a^+ = a \vee 0$.

Our first result is the following large deviation inequality valid for all x > 0. 



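Theorem 2.1. For any $\delta \in (0, 1]$ and $x \ge 0$,
\[
\mathbf{P}(S_n > x\sigma) \le \big(1 - \Phi(\tilde{x})\big) \Big[ 1 + C_\delta\, (1 + \tilde{x})\, \frac{\varepsilon}{\sigma} \Big], \tag{2.1}
\]
where
\[
\tilde{x} = \frac{2x}{1 + \sqrt{1 + 2(1+\delta)x\varepsilon/\sigma}}
\]
and $C_\delta = \big( \frac{62.493}{\delta} + 212.813 \big) \vee \frac{180.48}{\delta}$. Moreover,
\[
\tilde{x} = x \Big[ 1 - \frac{1+\delta}{2}\, x\frac{\varepsilon}{\sigma} + o\Big( x\frac{\varepsilon}{\sigma} \Big) \Big] \quad \text{as } x\frac{\varepsilon}{\sigma} \to 0,
\]
and for $\delta = 1$, $C_\delta = 275.306$.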



The interesting feature of the bound (2.1) is that it recovers closely the shape of the standard
normal tail when $x$ is moderate and $r = \frac{\varepsilon}{\sigma}$ becomes small, which is not the case for Bennett's bound
$B(x, \frac{\varepsilon}{\sigma})$ (see Figure 1). Our result can also be compared with Cramér's large deviation result in
the i.i.d. case: under Cramér's condition that $\mathbf{E}e^{\delta|\xi_1|} < \infty$ for some $\delta > 0$,
\[
\frac{\mathbf{P}(S_n > x\sigma)}{1 - \Phi(x)} = \exp\Big\{ \frac{x^3}{\sqrt{n}}\, \lambda\Big( \frac{x}{\sqrt{n}} \Big) \Big\} \Big[ 1 + O\Big( \frac{1+x}{\sqrt{n}} \Big) \Big], \tag{2.2}
\]
where $\lambda(\cdot)$ is the Cramér series [19] and $0 \le x = o(\sqrt{n})$ (cf. Cramér [5] or Petrov [19]). Note that
in the i.i.d. case Cramér's condition is equivalent to Bernstein's condition (1.1), and $\frac{\varepsilon}{\sigma} = O(\frac{1}{\sqrt{n}})$
as $n \to \infty$. With respect to Cramér's result, the advantage of (2.1) is that it is valid for all $x \ge 0$.
An improvement of Bennett's bound (1.3) for large $x$ can be obtained from Theorem 2.2 formulated
below.



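Theorem 2.2. For all $x \ge 0$,
\begin{align}
\mathbf{P}(S_n > x\sigma) &\le \big(1 - \Phi(\hat{x})\big) \Big[ 1 + A\Big(x, \frac{\varepsilon}{\sigma}\Big) \frac{\varepsilon}{\sigma} \Big] \tag{2.3} \\
&= B\Big(x, \frac{\varepsilon}{\sigma}\Big)\, F_1\Big(x, \frac{\varepsilon}{\sigma}\Big), \tag{2.4}
\end{align}
where $\hat{x}$ is defined in (1.3),
\[
A\Big(x, \frac{\varepsilon}{\sigma}\Big) = \frac{1}{M(\hat{x})} \Big( \Big( \frac{10\,\hat{x}}{\sqrt{1+2x\varepsilon/\sigma}} + 84.9 \Big) \wedge \frac{\sigma}{\varepsilon} \Big) \tag{2.5}
\]
and
\[
F_1\Big(x, \frac{\varepsilon}{\sigma}\Big) = M(\hat{x}) \Big[ 1 + A\Big(x, \frac{\varepsilon}{\sigma}\Big) \frac{\varepsilon}{\sigma} \Big]. \tag{2.6}
\]
Moreover, $F_1(x, \frac{\varepsilon}{\sigma}) = M(\hat{x})\,(1 + o(1))$ when $x = o\big(\sqrt{\sigma/\varepsilon}\big)$ and $\frac{\sigma}{\varepsilon} \to \infty$, and $F_1(x, \frac{\varepsilon}{\sigma}) \le 1 + M(\hat{x})$ for
$x \ge 0$.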



The advantage of Theorem 2.2 is that in the normal distribution function $\Phi$ we have the
expression $\hat{x}$ of (1.3) instead of the smaller term $\tilde{x}$ figuring in Theorem 2.1, which represents a significant
improvement. Inequality (2.3) improves Bennett's inequality (1.3) by the factor $F_1(x, \frac{\varepsilon}{\sigma})$ of order
$\frac{1}{1+x}$ for $x = o\big(\sqrt{\sigma/\varepsilon}\big)$, which, following Talagrand [30], we call the missing factor. The numerical
results displayed in Figure 1 show that bound (2.3) performs better than bound (2.1) and
significantly better than Bennett's bound $B(x, \frac{\varepsilon}{\sigma})$.

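A further significant improvement of Bennett's inequality (1.3) for all $x \ge 0$ is given by the
following theorem: we replace the bound $B(x, \frac{\varepsilon}{\sigma})$ by the following smaller one:
\[
B_n\Big(x, \frac{\varepsilon}{\sigma}\Big) = B\Big(x, \frac{\varepsilon}{\sigma}\Big) \exp\Big\{ -n\psi\Big( \frac{\hat{x}^2}{2n\sqrt{1+2x\varepsilon/\sigma}} \Big) \Big\},
\]
where $\hat{x}$ is defined in (1.3) and $\psi(t) = t - \log(1+t)$ is a nonnegative convex function in $t \ge 0$.

Theorem 2.3. For all $x \ge 0$,
\begin{align}
\mathbf{P}(S_n > x\sigma) &\le B_n\Big(x, \frac{\varepsilon}{\sigma}\Big)\, F_2\Big(x, \frac{\varepsilon}{\sigma}\Big) \tag{2.7} \\
&\le B\Big(x, \frac{\varepsilon}{\sigma}\Big)\, F_2\Big(x, \frac{\varepsilon}{\sigma}\Big), \tag{2.8}
\end{align}
where
\[
F_2\Big(x, \frac{\varepsilon}{\sigma}\Big) = \Big( M(x) + 27.99\, R\Big( x\frac{\varepsilon}{\sigma} \Big) \frac{\varepsilon}{\sigma} \Big) \wedge 1 \tag{2.9}
\]
and
\[
R(t) = \begin{cases} \dfrac{(1 - t + 6t^2)^3}{(1-3t)^{3/2}(1-t)^7}, & \text{if } 0 \le t < \frac{1}{3}, \\[2mm] \infty, & \text{if } t \ge \frac{1}{3}, \end{cases} \tag{2.10}
\]
is an increasing function. Moreover, for all $0 \le x \le \alpha\frac{\sigma}{\varepsilon}$ with $0 \le \alpha < \frac{1}{3}$, we have $R(x\frac{\varepsilon}{\sigma}) \le R(\alpha)$.
If $\alpha = 0.1$, we have $27.99\, R(\alpha) \le 88.41$.

Fig 1. We display Bennett's bound $B(x, r)$, bounds (2.1) with $\delta = 1$ and (2.3) as functions of $x$ with $r = 0.001$.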
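To make the ingredients of Theorem 2.3 concrete, here is a small sketch that evaluates $R(t)$, $\psi(t)$, $B_n$ and $F_2$ and compares the refined bound (2.7) with Bennett's bound. All function names, and the shorthand `r` for $\varepsilon/\sigma$, are illustrative choices made here; the sample values of `x`, `r` and `n` are arbitrary.

```python
import math


def mills_m(x: float) -> float:
    """M(x) = (1 - Phi(x)) * exp(x^2 / 2)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0)) * math.exp(0.5 * x * x)


def R(t: float) -> float:
    """R(t) of (2.10): finite on [0, 1/3), infinite for t >= 1/3."""
    if t >= 1.0 / 3.0:
        return math.inf
    return (1.0 - t + 6.0 * t * t) ** 3 / ((1.0 - 3.0 * t) ** 1.5 * (1.0 - t) ** 7)


def psi(t: float) -> float:
    """psi(t) = t - log(1 + t), nonnegative for t >= 0."""
    return t - math.log1p(t)


def bounds(x: float, r: float, n: int) -> tuple:
    """Return (B_n * F_2, B): the bound (2.7) and Bennett's bound (1.3)."""
    s = math.sqrt(1.0 + 2.0 * x * r)
    xh = 2.0 * x / (1.0 + s)                      # x_hat of (1.3)
    bennett = math.exp(-0.5 * xh * xh)            # B(x, r)
    b_n = bennett * math.exp(-n * psi(xh * xh / (2.0 * n * s)))
    f_2 = min(mills_m(x) + 27.99 * R(x * r) * r, 1.0)
    return b_n * f_2, bennett


if __name__ == "__main__":
    print(27.99 * R(0.1))                         # compare with the constant 88.41
    print(bounds(2.0, 0.001, 10**6))
```

Since $\psi \ge 0$ and $F_2 \le 1$, the first returned value can never exceed the second, which is the content of (2.7) and (2.8).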

To highlight the improvement of Theorem 2.3 over Bennett's bound, we note that $B_n(x, \frac{\varepsilon}{\sigma}) \le B(x, \frac{\varepsilon}{\sigma})$
and, in the i.i.d. case (or, more generally, when $\frac{\varepsilon}{\sigma} = \frac{c_0}{\sqrt{n}}$ for some constant $c_0 > 0$),
\[
B_n\Big(\sqrt{n}\, x, \frac{c_0}{\sqrt{n}}\Big) = B\Big(\sqrt{n}\, x, \frac{c_0}{\sqrt{n}}\Big) \exp\{-c_x n\}, \tag{2.11}
\]
where $c_x > 0$, $x > 0$, is independent of $n$. Thus, Bennett's bound is strengthened by adding
a factor $\exp\{-c_x n\}$. The second improvement in the right-hand side of (2.7) comes from the
missing factor $F_2(x, \frac{\varepsilon}{\sigma})$, which is of order $\frac{1}{1+x}$ for moderate values of $x$ satisfying $0 \le x = o(\frac{\sigma}{\varepsilon})$.
The numerical values of the missing factor $F_2(x, \frac{\varepsilon}{\sigma})$ are displayed in Figure 2.

Our numerical results confirm that the bound $B_n(x, \frac{\varepsilon}{\sigma})\, F_2(x, \frac{\varepsilon}{\sigma})$ in (2.7) is significantly better
than Bennett's bound $B(x, \frac{\varepsilon}{\sigma})$ for all $x \ge 0$. For comparison, we display the ratios of $B_n(x, r) F_2(x, r)$
to $B(x, r)$ in Figure 3 for various $r = \frac{\varepsilon}{\sigma}$.

The following corollary improves inequality (2.1) of Theorem 2.1 in the range $0 \le x \le \alpha\frac{\sigma}{\varepsilon}$ with
$0 \le \alpha < \frac{1}{3}$.

Corollary 2.1. For all $0 \le x \le \alpha\frac{\sigma}{\varepsilon}$ with $0 \le \alpha < \frac{1}{3}$,
\[
\mathbf{P}(S_n > x\sigma) \le \big(1 - \Phi(\hat{x})\big) \Big[ 1 + 70.17\, R(\alpha)\, (1 + \hat{x})\, \frac{\varepsilon}{\sigma} \Big], \tag{2.12}
\]
where $\hat{x}$ is defined in (1.3) and $R(t)$ by (2.10). In particular, if $\alpha = 0.1$ we have $70.17\, R(\alpha) \le 221.63$.

Fig 2. The missing factor $F_2(x, r)$ is displayed as a function of $x$ for various values of $r$.

Fig 3. Ratio of $B_n(x, r) F_2(x, r)$ to $B(x, r)$ as a function of $x$ for various values of $r = \frac{\varepsilon}{\sigma}$.

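For the lower bound of the tail probabilities $\mathbf{P}(S_n > x\sigma)$, we have the following result, which
complements Corollary 2.1.

Theorem 2.4. For all $0 \le x \le \alpha\frac{\sigma}{\varepsilon}$ with $0 \le \alpha \le \frac{1}{9.6}$,
\[
\mathbf{P}(S_n > x\sigma) \ge \big(1 - \Phi(\check{x})\big) \Big[ 1 - c_\alpha\, (1 + \check{x})\, \frac{\varepsilon}{\sigma} \Big],
\]
where $\check{x} = \frac{\overline{\lambda}\sigma}{(1 - \overline{\lambda}\varepsilon)^3}$ with $\overline{\lambda} = \frac{2x/\sigma}{1 + \sqrt{1 - 9.6x\varepsilon/\sigma}}$, and $c_\alpha = 67.38\, R\Big( \frac{2\alpha}{1+\sqrt{1-9.6\alpha}} \Big) \le 1753.23$.
Moreover,
\[
\check{x} = x \Big( 1 + 5.4\, x\frac{\varepsilon}{\sigma} + o\Big( x\frac{\varepsilon}{\sigma} \Big) \Big) \quad \text{as } x\frac{\varepsilon}{\sigma} \to 0.
\]
In particular, if $\alpha = 0.1$ we have $c_\alpha = 67.38\, R(\frac{1}{6}) \le 682.89$.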

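Combining Corollary 2.1 and Theorem 2.4, we obtain, for all $0 \le x \le 0.1\frac{\sigma}{\varepsilon}$,
\[
\mathbf{P}(S_n > x\sigma) = \Big( 1 - \Phi\Big( x\Big( 1 + \theta_1 c_1 x\frac{\varepsilon}{\sigma} \Big) \Big) \Big) \Big[ 1 + \theta_2 c_2 (1 + x) \frac{\varepsilon}{\sigma} \Big],
\]
where $c_1, c_2 > 0$ are some absolute constants and $|\theta_1|, |\theta_2| \le 1$.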

To close this section, we give an improvement of Talagrand's inequality (1.5). 

Theorem 2.5. For all < x < §§, 



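\[
\mathbf{P}(S_n > x\sigma) = \inf_{\lambda \ge 0} \mathbf{E}e^{\lambda(S_n - x\sigma)}\, F_3\Big(x, \frac{\varepsilon}{\sigma}\Big), \tag{2.14}
\]
where
\[
F_3\Big(x, \frac{\varepsilon}{\sigma}\Big) = M(x) + 27.99\, \theta\, R\Big( x\frac{\varepsilon}{\sigma} \Big) \frac{\varepsilon}{\sigma}, \tag{2.15}
\]
$R(t)$ is defined by (2.10) and $|\theta| \le 1$. Moreover, $\inf_{\lambda \ge 0} \mathbf{E}e^{\lambda(S_n - x\sigma)} \le B_n(x, \frac{\varepsilon}{\sigma}) \le B(x, \frac{\varepsilon}{\sigma})$. In
addition, for $0 \le x \le 0.1\frac{\sigma}{\varepsilon}$, we have $27.99\, R(x\frac{\varepsilon}{\sigma}) \le 88.41$.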

It is clear that our equality (2.14) implies Talagrand's inequality (1.5), with some information on
Talagrand's constant $K$, under a less restrictive condition (Talagrand supposed that the $\xi_i$ are
bounded: $-b \le \xi_i \le 1$).

Notice that (2.14) can be written in the following form: for < x < a|- and a € (0, |), 



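\[
\frac{\mathbf{P}(S_n > x\sigma)}{M(x) \inf_{\lambda \ge 0} \mathbf{E}e^{\lambda(S_n - x\sigma)}} = 1 + 27.99\, \theta_1 R(\alpha)\, \frac{\varepsilon}{M(x)\sigma} = 1 + 70.17\, \theta_2 R(\alpha)\, (1 + x)\, \frac{\varepsilon}{\sigma}, \tag{2.16}
\]
where $|\theta_1|, |\theta_2| \le 1$ and the last step holds since
\[
\frac{1}{\sqrt{2\pi}\,(1+t)} \le M(t) \le \frac{1}{\sqrt{\pi}\,(1+t)}, \quad t \ge 0, \tag{2.17}
\]
(see Feller [13]). Equality (2.16) implies that the relative error between $\mathbf{P}(S_n > x\sigma)$ and $M(x) \inf_{\lambda \ge 0} \mathbf{E}e^{\lambda(S_n - x\sigma)}$
converges to 0 uniformly in the range $0 \le x = o(\frac{\sigma}{\varepsilon})$ as $\frac{\sigma}{\varepsilon} \to \infty$. In particular, when $S_n$ is a sum
of Rademacher random variables, i.e. $\mathbf{P}(\xi_i = -1) = \mathbf{P}(\xi_i = 1) = \frac{1}{2}$, we have the following simulations in Figure 4.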
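For Rademacher sums both sides of the comparison can be computed in closed form, so the experiment behind Figure 4 can be approximated by a short script. The sketch below is an independent illustration, not the authors' simulation code: it assumes $\sigma = \sqrt{n}$, uses $\mathbf{E}e^{\lambda\xi_i} = \cosh\lambda$, and all function names are chosen here.

```python
import math


def mills_m(x: float) -> float:
    """M(x) = (1 - Phi(x)) * exp(x^2 / 2)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0)) * math.exp(0.5 * x * x)


def rademacher_tail(n: int, x: float) -> float:
    """Exact P(S_n > x * sqrt(n)) for a sum S_n of n independent +/-1 signs."""
    threshold = x * math.sqrt(n)
    count = 0
    for k in range(n + 1):            # k = number of +1 signs, S_n = 2k - n
        if 2 * k - n > threshold:
            count += math.comb(n, k)
    return count / 2 ** n


def cramer_approx(n: int, x: float) -> float:
    """M(x) * inf_{lambda >= 0} E exp(lambda * (S_n - x * sigma)), sigma = sqrt(n).

    The infimum of n*log(cosh(lambda)) - lambda*x*sqrt(n) is attained where
    tanh(lambda) = x / sqrt(n); this requires 0 <= x < sqrt(n).
    """
    lam = math.atanh(x / math.sqrt(n))
    log_inf = n * math.log(math.cosh(lam)) - lam * x * math.sqrt(n)
    return mills_m(x) * math.exp(log_inf)


if __name__ == "__main__":
    n = 100
    for x in (0.0, 0.5, 1.0, 2.0, 3.0):
        exact = rademacher_tail(n, x)
        approx = cramer_approx(n, x)
        print(x, exact, approx, exact / approx)
```

The printed ratio stays of order one for moderate `x`; it is not exactly 1 because $S_n$ lives on a lattice and `n = 100` is finite.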
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Comparison for n = 100 
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3. Auxiliary results 

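We consider the positive random variable
\[
Z_n(\lambda) = \prod_{i=1}^n \frac{e^{\lambda\xi_i}}{\mathbf{E}e^{\lambda\xi_i}}
\]
(the Esscher transformation), so that $\mathbf{E}Z_n(\lambda) = 1$. We introduce the conjugate probability measure
$\mathbf{P}_\lambda$ defined by
\[
d\mathbf{P}_\lambda = Z_n(\lambda)\, d\mathbf{P}. \tag{3.1}
\]
Denote by $\mathbf{E}_\lambda$ the expectation with respect to $\mathbf{P}_\lambda$. Then for any positive and measurable function
$f$,
\[
\mathbf{E}_\lambda f(\xi_i) = \frac{\mathbf{E}f(\xi_i)e^{\lambda\xi_i}}{\mathbf{E}e^{\lambda\xi_i}}, \quad i = 1, \ldots, n.
\]
Setting
\[
b_i(\lambda) = \mathbf{E}_\lambda \xi_i = \frac{\mathbf{E}\xi_i e^{\lambda\xi_i}}{\mathbf{E}e^{\lambda\xi_i}}, \quad i = 1, \ldots, n,
\]
and
\[
\eta_i(\lambda) = \xi_i - b_i(\lambda), \quad i = 1, \ldots, n,
\]
we obtain the following decomposition:
\[
X_k = B_k(\lambda) + Y_k(\lambda), \quad k = 1, \ldots, n, \tag{3.2}
\]
where $X_k = \sum_{i=1}^k \xi_i$ (so that $X_n = S_n$),
\[
B_k(\lambda) = \sum_{i=1}^k b_i(\lambda) \quad \text{and} \quad Y_k(\lambda) = \sum_{i=1}^k \eta_i(\lambda).
\]
In the following, we give some lower and upper bounds of $B_n(\lambda)$, which will be used in the
proofs of theorems.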

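Lemma 3.1. For all $0 \le \lambda < \varepsilon^{-1}$,
\[
(1 - 2.4\lambda\varepsilon)\lambda\sigma^2 \le \frac{(1 - 1.5\lambda\varepsilon)(1 - \lambda\varepsilon)}{1 - \lambda\varepsilon + 6\lambda^2\varepsilon^2}\, \lambda\sigma^2 \le B_n(\lambda) \le \frac{1 - 0.5\lambda\varepsilon}{(1 - \lambda\varepsilon)^2}\, \lambda\sigma^2.
\]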

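Proof. Since $\mathbf{E}\xi_i = 0$, by Jensen's inequality, we have $\mathbf{E}e^{\lambda\xi_i} \ge 1$. Noting that
\[
\mathbf{E}\xi_i e^{\lambda\xi_i} = \mathbf{E}\xi_i (e^{\lambda\xi_i} - 1) \ge 0, \quad \lambda \ge 0,
\]
by Taylor's expansion of $e^x$, we get
\[
B_n(\lambda) \le \sum_{i=1}^n \mathbf{E}\xi_i e^{\lambda\xi_i} = \lambda\sigma^2 + \sum_{i=1}^n \sum_{k=2}^{+\infty} \frac{\lambda^k}{k!}\, \mathbf{E}\xi_i^{k+1}. \tag{3.3}
\]
Using Bernstein's condition (1.1), we obtain, for all $0 \le \lambda < \varepsilon^{-1}$,
\[
\sum_{i=1}^n \sum_{k=2}^{+\infty} \frac{\lambda^k}{k!}\, |\mathbf{E}\xi_i^{k+1}| \le \frac{1}{2} \lambda^2\sigma^2\varepsilon \sum_{k=2}^{+\infty} (k+1)(\lambda\varepsilon)^{k-2} = \frac{3 - 2\lambda\varepsilon}{2(1 - \lambda\varepsilon)^2}\, \lambda^2\sigma^2\varepsilon. \tag{3.4}
\]
Combining (3.3) and (3.4), we get the desired upper bound of $B_n(\lambda)$: for all $0 \le \lambda < \varepsilon^{-1}$,
\[
B_n(\lambda) \le \lambda\sigma^2 + \frac{3 - 2\lambda\varepsilon}{2(1 - \lambda\varepsilon)^2}\, \lambda^2\sigma^2\varepsilon = \frac{1 - 0.5\lambda\varepsilon}{(1 - \lambda\varepsilon)^2}\, \lambda\sigma^2.
\]
By Jensen's inequality and Bernstein's condition (1.1),
\[
(\mathbf{E}\xi_i^2)^2 \le \mathbf{E}\xi_i^4 \le 12\varepsilon^2\, \mathbf{E}\xi_i^2,
\]
from which we get
\[
\mathbf{E}\xi_i^2 \le 12\varepsilon^2.
\]
Using again Bernstein's condition (1.1), we have, for all $0 \le \lambda < \varepsilon^{-1}$,
\begin{align}
\mathbf{E}e^{\lambda\xi_i} &\le 1 + \sum_{k=2}^{+\infty} \frac{\lambda^k}{k!}\, |\mathbf{E}\xi_i^k| \notag \\
&\le 1 + \frac{\lambda^2\, \mathbf{E}\xi_i^2}{2(1 - \lambda\varepsilon)} \notag \\
&\le 1 + \frac{6\lambda^2\varepsilon^2}{1 - \lambda\varepsilon} = \frac{1 - \lambda\varepsilon + 6\lambda^2\varepsilon^2}{1 - \lambda\varepsilon}. \tag{3.5}
\end{align}
Notice that $g(t) = e^t - (1 + t + \frac{1}{2}t^2)$ satisfies $g(t) \ge 0$ if $t \ge 0$, and $g(t) \le 0$ if $t \le 0$. So $t\, g(t) \ge 0$
for all $t \in \mathbf{R}$. That is, $t e^t \ge t(1 + t + \frac{1}{2}t^2)$ for all $t \in \mathbf{R}$. Therefore
\[
\xi_i e^{\lambda\xi_i} \ge \xi_i \Big( 1 + \lambda\xi_i + \frac{\lambda^2\xi_i^2}{2} \Big).
\]
Taking expectation, we get
\[
\mathbf{E}\xi_i e^{\lambda\xi_i} \ge \lambda\mathbf{E}\xi_i^2 + \frac{\lambda^2}{2}\mathbf{E}\xi_i^3 \ge \lambda\mathbf{E}\xi_i^2 - \frac{\lambda^2}{2}\, 3\varepsilon\, \mathbf{E}\xi_i^2 = (1 - 1.5\lambda\varepsilon)\lambda\mathbf{E}\xi_i^2,
\]
from which it follows that
\[
\sum_{i=1}^n \mathbf{E}\xi_i e^{\lambda\xi_i} \ge (1 - 1.5\lambda\varepsilon)\lambda\sigma^2. \tag{3.6}
\]
Combining (3.5) and (3.6), we obtain the following lower bound of $B_n(\lambda)$: for all $0 \le \lambda < \varepsilon^{-1}$,
\begin{align}
B_n(\lambda) = \sum_{i=1}^n \frac{\mathbf{E}\xi_i e^{\lambda\xi_i}}{\mathbf{E}e^{\lambda\xi_i}} &\ge \frac{(1 - 1.5\lambda\varepsilon)(1 - \lambda\varepsilon)}{1 - \lambda\varepsilon + 6\lambda^2\varepsilon^2}\, \lambda\sigma^2 \notag \\
&\ge (1 - 2.4\lambda\varepsilon)\lambda\sigma^2. \tag{3.7}
\end{align}
This completes the proof of Lemma 3.1. □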
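The two purely algebraic steps of this proof are stated without computation. A possible way to check them, writing $u = \lambda\varepsilon \in [0, 1)$ (a shorthand introduced here, not used in the original), is sketched below.

```latex
% Upper bound: sum the series in (3.4) and add it to \lambda\sigma^2.
\sum_{k=2}^{\infty} (k+1)\,u^{k-2}
  = \sum_{j=0}^{\infty} (j+3)\,u^{j}
  = \frac{3}{1-u} + \frac{u}{(1-u)^2}
  = \frac{3-2u}{(1-u)^2},
\qquad
1 + \frac{u\,(3-2u)}{2(1-u)^2}
  = \frac{2(1-u)^2 + 3u - 2u^2}{2(1-u)^2}
  = \frac{1-0.5u}{(1-u)^2}.

% Lower bound: the last inequality in (3.7) is equivalent to
(1-1.5u)(1-u) - (1-2.4u)(1-u+6u^2)
  = u\,\bigl(0.9 - 6.9u + 14.4u^2\bigr) \;\ge\; 0,
% which holds for u >= 0 because the quadratic has negative discriminant:
6.9^2 - 4 \cdot 14.4 \cdot 0.9 = 47.61 - 51.84 < 0.
```

Dividing the last display by the positive quantity $1 - u + 6u^2$ gives the second inequality of (3.7).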
We now consider the following cumulant function
\[
\Psi_n(\lambda) = \sum_{i=1}^n \log \mathbf{E}e^{\lambda\xi_i}, \quad 0 \le \lambda < \varepsilon^{-1}. \tag{3.8}
\]
We have the following elementary bound for $\Psi_n(\lambda)$.
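Lemma 3.2. For all $0 \le \lambda < \varepsilon^{-1}$,
\[
\Psi_n(\lambda) \le n \log\Big( 1 + \frac{\lambda^2\sigma^2}{2n(1 - \lambda\varepsilon)} \Big) \le \frac{\lambda^2\sigma^2}{2(1 - \lambda\varepsilon)}
\]
and
\[
-\lambda\Psi_n'(\lambda) + \Psi_n(\lambda) \ge -\frac{\lambda^2\sigma^2}{2(1 - \lambda\varepsilon)^6}.
\]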

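Proof. By Bernstein's condition (1.1), it is easy to see that, for all $0 \le \lambda < \varepsilon^{-1}$,
\[
\mathbf{E}e^{\lambda\xi_i} \le 1 + \sum_{k=2}^{\infty} \frac{\lambda^k}{k!}\, |\mathbf{E}\xi_i^k| \le 1 + \frac{\lambda^2\, \mathbf{E}\xi_i^2}{2} \sum_{k=2}^{\infty} (\lambda\varepsilon)^{k-2} = 1 + \frac{\lambda^2\, \mathbf{E}\xi_i^2}{2(1 - \lambda\varepsilon)}.
\]
Then, we have
\[
\Psi_n(\lambda) \le \log \prod_{i=1}^n \Big( 1 + \frac{\lambda^2\, \mathbf{E}\xi_i^2}{2(1 - \lambda\varepsilon)} \Big). \tag{3.9}
\]
Since the geometric mean does not exceed the arithmetic mean, we get
\[
\prod_{i=1}^n \Big( 1 + \frac{\lambda^2\, \mathbf{E}\xi_i^2}{2(1 - \lambda\varepsilon)} \Big)^{1/n} \le \frac{1}{n} \sum_{i=1}^n \Big( 1 + \frac{\lambda^2\, \mathbf{E}\xi_i^2}{2(1 - \lambda\varepsilon)} \Big) = 1 + \frac{\lambda^2\sigma^2}{2n(1 - \lambda\varepsilon)}. \tag{3.10}
\]
Using (3.9), (3.10) and the inequality
\[
\log(1 + t) \le t, \quad t \ge 0,
\]
we obtain the first assertion of the lemma. Since $\Psi_n(0) = 0$ and $\Psi_n'(\lambda) = B_n(\lambda)$, by Lemma 3.1,
for all $0 \le \lambda < \varepsilon^{-1}$,
\[
\Psi_n(\lambda) = \int_0^\lambda B_n(t)\, dt \ge \int_0^\lambda t(1 - 2.4t\varepsilon)\sigma^2\, dt = \frac{\lambda^2\sigma^2}{2}(1 - 1.6\lambda\varepsilon).
\]
Therefore, using again Lemma 3.1, we see that
\begin{align*}
-\lambda B_n(\lambda) + \Psi_n(\lambda) &\ge -\frac{1 - 0.5\lambda\varepsilon}{(1 - \lambda\varepsilon)^2}\, \lambda^2\sigma^2 + \frac{\lambda^2\sigma^2}{2}(1 - 1.6\lambda\varepsilon) \\
&\ge -\frac{\lambda^2\sigma^2}{2(1 - \lambda\varepsilon)^6},
\end{align*}
which completes the proof of the second assertion of the lemma. □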
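Denote $\overline{\sigma}^2(\lambda) = \mathbf{E}_\lambda Y_n^2(\lambda)$. By the relation between $\mathbf{E}$ and $\mathbf{E}_\lambda$, we have
\[
\overline{\sigma}^2(\lambda) = \sum_{i=1}^n \Big( \frac{\mathbf{E}\xi_i^2 e^{\lambda\xi_i}}{\mathbf{E}e^{\lambda\xi_i}} - \frac{(\mathbf{E}\xi_i e^{\lambda\xi_i})^2}{(\mathbf{E}e^{\lambda\xi_i})^2} \Big), \quad 0 \le \lambda < \varepsilon^{-1}.
\]

Lemma 3.3. For all $0 \le \lambda < \varepsilon^{-1}$,
\[
\frac{(1 - \lambda\varepsilon)^2 (1 - 3\lambda\varepsilon)}{(1 - \lambda\varepsilon + 6\lambda^2\varepsilon^2)^2}\, \sigma^2 \le \overline{\sigma}^2(\lambda) \le \frac{\sigma^2}{(1 - \lambda\varepsilon)^3}.
\]
Proof. Denote $f(\lambda) = \mathbf{E}\xi_i^2 e^{\lambda\xi_i}\, \mathbf{E}e^{\lambda\xi_i} - (\mathbf{E}\xi_i e^{\lambda\xi_i})^2$. Then,
\[
f'(0) = \mathbf{E}\xi_i^3 \quad \text{and} \quad f''(\lambda) = \mathbf{E}\xi_i^4 e^{\lambda\xi_i}\, \mathbf{E}e^{\lambda\xi_i} - (\mathbf{E}\xi_i^2 e^{\lambda\xi_i})^2 \ge 0.
\]
Thus,
\[
f(\lambda) \ge f(0) + f'(0)\lambda = \mathbf{E}\xi_i^2 + \lambda\mathbf{E}\xi_i^3. \tag{3.12}
\]
Using (3.12), (3.5) and Bernstein's condition (1.1), we have, for all $0 \le \lambda < \varepsilon^{-1}$,
\begin{align*}
\mathbf{E}_\lambda \eta_i^2 = \frac{\mathbf{E}\xi_i^2 e^{\lambda\xi_i}\, \mathbf{E}e^{\lambda\xi_i} - (\mathbf{E}\xi_i e^{\lambda\xi_i})^2}{(\mathbf{E}e^{\lambda\xi_i})^2} &\ge \frac{\mathbf{E}\xi_i^2 + \lambda\mathbf{E}\xi_i^3}{(\mathbf{E}e^{\lambda\xi_i})^2} \\
&\ge \Big( \frac{1 - \lambda\varepsilon}{1 - \lambda\varepsilon + 6\lambda^2\varepsilon^2} \Big)^2 (\mathbf{E}\xi_i^2 + \lambda\mathbf{E}\xi_i^3) \\
&\ge \frac{(1 - \lambda\varepsilon)^2 (1 - 3\lambda\varepsilon)}{(1 - \lambda\varepsilon + 6\lambda^2\varepsilon^2)^2}\, \mathbf{E}\xi_i^2.
\end{align*}
Therefore
\[
\overline{\sigma}^2(\lambda) \ge \frac{(1 - \lambda\varepsilon)^2 (1 - 3\lambda\varepsilon)}{(1 - \lambda\varepsilon + 6\lambda^2\varepsilon^2)^2}\, \sigma^2.
\]
Using Taylor's expansion of $e^x$ and Bernstein's condition (1.1) again, we obtain
\[
\overline{\sigma}^2(\lambda) \le \sum_{i=1}^n \mathbf{E}\xi_i^2 e^{\lambda\xi_i} \le \frac{\sigma^2}{(1 - \lambda\varepsilon)^3}.
\]
This completes the proof of Lemma 3.3. □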

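For the random variable $Y_n(\lambda)$ with $0 \le \lambda < \varepsilon^{-1}$, we have the following result on the rate of
convergence to the standard normal law.

Lemma 3.4. For all $0 \le \lambda < \varepsilon^{-1}$,
\[
\sup_{y \in \mathbf{R}} \Big| \mathbf{P}_\lambda\Big( \frac{Y_n(\lambda)}{\overline{\sigma}(\lambda)} \le y \Big) - \Phi(y) \Big| \le 13.44\, \frac{\sigma^2\varepsilon}{\overline{\sigma}^3(\lambda)(1 - \lambda\varepsilon)^4}.
\]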

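Proof. Since $Y_n(\lambda) = \sum_{i=1}^n \eta_i(\lambda)$ is a sum of independent random variables $\eta_i(\lambda)$ that are centered
with respect to $\mathbf{P}_\lambda$, by the rate of convergence in the central limit theorem (cf. e.g. Petrov [19], p. 115)
we get, for $0 \le \lambda < \varepsilon^{-1}$,
\[
\sup_{y \in \mathbf{R}} \Big| \mathbf{P}_\lambda\Big( \frac{Y_n(\lambda)}{\overline{\sigma}(\lambda)} \le y \Big) - \Phi(y) \Big| \le C_1\, \frac{1}{\overline{\sigma}^3(\lambda)} \sum_{i=1}^n \mathbf{E}_\lambda |\eta_i|^3,
\]
where $C_1 > 0$ is an absolute constant. For $0 \le \lambda < \varepsilon^{-1}$, using Bernstein's condition, we have
\begin{align*}
\sum_{i=1}^n \mathbf{E}_\lambda |\eta_i|^3 &\le 4 \sum_{i=1}^n \mathbf{E}_\lambda \big( |\xi_i|^3 + (\mathbf{E}_\lambda |\xi_i|)^3 \big) \\
&\le 8 \sum_{i=1}^n \mathbf{E}|\xi_i|^3 \exp\{|\lambda\xi_i|\} \\
&\le 8 \sum_{i=1}^n \sum_{j=0}^{\infty} \frac{\lambda^j}{j!}\, \mathbf{E}|\xi_i|^{3+j} \\
&\le 4\sigma^2\varepsilon \sum_{j=0}^{\infty} (j+3)(j+2)(j+1)(\lambda\varepsilon)^j.
\end{align*}
As
\[
\sum_{j=0}^{\infty} (j+3)(j+2)(j+1)\, x^j = \frac{d^3}{dx^3} \sum_{j=0}^{\infty} x^j = \frac{6}{(1 - x)^4}, \quad |x| < 1,
\]
we obtain, for $0 \le \lambda < \varepsilon^{-1}$,
\[
\sum_{i=1}^n \mathbf{E}_\lambda |\eta_i|^3 \le 24\, \frac{\sigma^2\varepsilon}{(1 - \lambda\varepsilon)^4}.
\]
Therefore, we have, for $0 \le \lambda < \varepsilon^{-1}$,
\begin{align*}
\sup_{y \in \mathbf{R}} \Big| \mathbf{P}_\lambda\Big( \frac{Y_n(\lambda)}{\overline{\sigma}(\lambda)} \le y \Big) - \Phi(y) \Big| &\le 24\, C_1\, \frac{\sigma^2\varepsilon}{\overline{\sigma}^3(\lambda)(1 - \lambda\varepsilon)^4} \\
&\le 13.44\, \frac{\sigma^2\varepsilon}{\overline{\sigma}^3(\lambda)(1 - \lambda\varepsilon)^4},
\end{align*}
where the last step holds as $C_1 \le 0.56$ (cf. Shevtsova [28]). □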

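Using Lemma 3.4, we easily obtain the following lemma.

Lemma 3.5. For all $0 \le \lambda \le 0.1\varepsilon^{-1}$,
\[
\sup_{y \in \mathbf{R}} \Big| \mathbf{P}_\lambda\Big( Y_n(\lambda) \le \frac{y\sigma}{1 - \lambda\varepsilon} \Big) - \Phi(y) \Big| \le 1.07\lambda\varepsilon + 42.45\, \frac{\varepsilon}{\sigma}.
\]
Proof. Using Lemma 3.3, we have, for all $0 \le \lambda < \frac{1}{3}\varepsilon^{-1}$,
\[
\sqrt{1 - \lambda\varepsilon} \le \frac{\sigma}{\overline{\sigma}(\lambda)(1 - \lambda\varepsilon)} \le \frac{1 - \lambda\varepsilon + 6\lambda^2\varepsilon^2}{(1 - \lambda\varepsilon)^2 \sqrt{1 - 3\lambda\varepsilon}}. \tag{3.13}
\]
It is easy to see that
\begin{align*}
\sup_{y \in \mathbf{R}} \Big| \mathbf{P}_\lambda\Big( Y_n(\lambda) \le \frac{y\sigma}{1 - \lambda\varepsilon} \Big) - \Phi(y) \Big| &\le \sup_{y \in \mathbf{R}} \Big| \mathbf{P}_\lambda\Big( \frac{Y_n(\lambda)}{\overline{\sigma}(\lambda)} \le \frac{y\sigma}{\overline{\sigma}(\lambda)(1 - \lambda\varepsilon)} \Big) - \Phi\Big( \frac{y\sigma}{\overline{\sigma}(\lambda)(1 - \lambda\varepsilon)} \Big) \Big| \\
&\quad + \sup_{y \in \mathbf{R}} \Big| \Phi\Big( \frac{y\sigma}{\overline{\sigma}(\lambda)(1 - \lambda\varepsilon)} \Big) - \Phi(y) \Big| \\
&=: I_1 + I_2.
\end{align*}
By Lemma 3.4 and (3.13), we get, for all $0 \le \lambda < \frac{1}{3}\varepsilon^{-1}$,
\[
I_1 \le 13.44\, \frac{\sigma^2\varepsilon}{\overline{\sigma}^3(\lambda)(1 - \lambda\varepsilon)^4} \le 13.44\, R(\lambda\varepsilon)\, \frac{\varepsilon}{\sigma}.
\]
Using Taylor's expansion and (3.13), we obtain, for all $0 \le \lambda < \frac{1}{3}\varepsilon^{-1}$,
\begin{align*}
I_2 &\le \sup_{y \in \mathbf{R}} \frac{1}{\sqrt{2\pi}}\, |y|\, e^{-\frac{y^2(1 - \lambda\varepsilon)}{2}} \Big| \frac{\sigma}{\overline{\sigma}(\lambda)(1 - \lambda\varepsilon)} - 1 \Big| \\
&\le \frac{1}{\sqrt{2e\pi(1 - \lambda\varepsilon)}} \Big[ \Big( \frac{1 - \lambda\varepsilon + 6\lambda^2\varepsilon^2}{(1 - \lambda\varepsilon)^2 \sqrt{1 - 3\lambda\varepsilon}} - 1 \Big) \vee \Big( 1 - \sqrt{1 - \lambda\varepsilon} \Big) \Big].
\end{align*}
By simple calculations, we obtain, for all $0 \le \lambda \le 0.1\varepsilon^{-1}$,
\[
\sup_{y \in \mathbf{R}} \Big| \mathbf{P}_\lambda\Big( Y_n(\lambda) \le \frac{y\sigma}{1 - \lambda\varepsilon} \Big) - \Phi(y) \Big| \le 1.07\lambda\varepsilon + 42.45\, \frac{\varepsilon}{\sigma}.
\]
This completes the proof of Lemma 3.5. □

4. Proofs of Theorems 2.1-2.3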



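In this section, we give upper bounds for $\mathbf{P}(S_n > x\sigma)$. For all $x \ge 0$ and $0 \le \lambda < \varepsilon^{-1}$, by (3.1) and
(3.2), we have:
\begin{align}
\mathbf{P}(S_n > x\sigma) &= \mathbf{E}_\lambda Z_n(\lambda)^{-1} \mathbf{1}_{\{S_n > x\sigma\}} \notag \\
&= \mathbf{E}_\lambda e^{-\lambda S_n + \Psi_n(\lambda)} \mathbf{1}_{\{S_n > x\sigma\}} \notag \\
&= \mathbf{E}_\lambda e^{-\lambda B_n(\lambda) + \Psi_n(\lambda) - \lambda Y_n(\lambda)} \mathbf{1}_{\{Y_n(\lambda) + B_n(\lambda) - x\sigma > 0\}}. \tag{4.1}
\end{align}
Setting $U_n(\lambda) = \lambda(Y_n(\lambda) + B_n(\lambda) - x\sigma)$, we get
\[
\mathbf{P}(S_n > x\sigma) = e^{-\lambda x\sigma + \Psi_n(\lambda)}\, \mathbf{E}_\lambda e^{-U_n(\lambda)} \mathbf{1}_{\{U_n(\lambda) > 0\}}.
\]
Since, by Fubini's theorem, for any real random variable $U$,
\[
\mathbf{E}e^{-U} \mathbf{1}_{\{U > 0\}} = \int_0^\infty e^{-t}\, \mathbf{P}(0 < U \le t)\, dt,
\]
we deduce, for all $x \ge 0$ and $0 \le \lambda < \varepsilon^{-1}$,
\[
\mathbf{P}(S_n > x\sigma) = e^{-\lambda x\sigma + \Psi_n(\lambda)} \int_0^\infty e^{-t}\, \mathbf{P}_\lambda(0 < U_n(\lambda) \le t)\, dt. \tag{4.2}
\]
In the following, $\mathcal{N}(0, 1)$ denotes a standard normal random variable.

4.1. Proof of Theorem 2.1

From (4.2), using Lemma 3.2, we obtain, for all $x \ge 0$ and $0 \le \lambda < \varepsilon^{-1}$,
\[
\mathbf{P}(S_n > x\sigma) \le e^{-\lambda x\sigma + \frac{\lambda^2\sigma^2}{2(1 - \lambda\varepsilon)}} \int_0^\infty e^{-t}\, \mathbf{P}_\lambda(0 < U_n(\lambda) \le t)\, dt. \tag{4.3}
\]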
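For any $x \ge 0$ and $\beta \in [0, 0.5)$, let $\overline{\lambda} = \overline{\lambda}(x) \in [0, \varepsilon^{-1})$ be the unique solution of the equation
\[
\frac{\lambda - \beta\lambda^2\varepsilon}{(1 - \lambda\varepsilon)^2} = \frac{x}{\sigma}.
\]
This definition and Lemma 3.1 imply that
\[
\overline{\lambda} = \frac{2x/\sigma}{1 + 2x\varepsilon/\sigma + \sqrt{1 + 4(1 - \beta)x\varepsilon/\sigma}} \quad \text{and} \quad B_n(\overline{\lambda}) \le x\sigma. \tag{4.4}
\]
Using (4.3) with $\lambda = \overline{\lambda}$, we get
\[
\mathbf{P}(S_n > x\sigma) \le e^{-\frac{1}{2}(1 + (1 - 2\beta)\overline{\lambda}\varepsilon)\tilde{x}^2} \int_0^\infty e^{-t}\, \mathbf{P}_{\overline{\lambda}}(0 < U_n(\overline{\lambda}) \le t)\, dt, \tag{4.5}
\]
where
\[
\tilde{x} = \frac{\overline{\lambda}\sigma}{1 - \overline{\lambda}\varepsilon}.
\]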
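The closed form in (4.4) can be checked numerically against the defining equation. The following sketch does this for one arbitrary choice of parameters; the function names and the sample values of $x$, $\sigma$, $\varepsilon$, $\beta$ are assumptions made here purely for illustration.

```python
import math


def lambda_bar(x: float, sigma: float, eps: float, beta: float) -> float:
    """Closed form (4.4) for the root of (lam - beta*lam^2*eps)/(1 - lam*eps)^2 = x/sigma."""
    a = x / sigma
    return 2.0 * a / (1.0 + 2.0 * a * eps + math.sqrt(1.0 + 4.0 * (1.0 - beta) * a * eps))


def residual(lam: float, x: float, sigma: float, eps: float, beta: float) -> float:
    """Left-hand side minus right-hand side of the defining equation."""
    return (lam - beta * lam * lam * eps) / (1.0 - lam * eps) ** 2 - x / sigma


def x_tilde(x: float, sigma: float, eps: float, beta: float) -> float:
    """x_tilde = lambda_bar * sigma / (1 - lambda_bar * eps)."""
    lam = lambda_bar(x, sigma, eps, beta)
    return lam * sigma / (1.0 - lam * eps)


if __name__ == "__main__":
    x, sigma, eps, beta = 3.0, 10.0, 1.0, 0.25
    lam = lambda_bar(x, sigma, eps, beta)
    delta = 1.0 - 2.0 * beta
    closed = 2.0 * x / (1.0 + math.sqrt(1.0 + 2.0 * (1.0 + delta) * x * eps / sigma))
    print(lam, residual(lam, x, sigma, eps, beta))   # residual should be ~0
    print(x_tilde(x, sigma, eps, beta), closed)      # the two values should agree
```

With $\delta = 1 - 2\beta$, the second printed line illustrates that $\tilde{x}$ coincides with $\frac{2x}{1 + \sqrt{1 + 2(1+\delta)x\varepsilon/\sigma}}$, the expression appearing in Theorem 2.1.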

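By (4.4) and Lemma 3.5, we have, for $0 \le \overline{\lambda} \le 0.1\varepsilon^{-1}$,
\[
\begin{aligned}
\int_0^\infty e^{-t}\, \mathbf{P}_{\overline{\lambda}}(0 < U_n(\overline{\lambda}) \le t)\, dt &= \int_0^\infty e^{-y\tilde{x}}\, \mathbf{P}_{\overline{\lambda}}(0 < U_n(\overline{\lambda}) \le y\tilde{x})\, \tilde{x}\, dy \\
&\le \int_0^\infty e^{-y\tilde{x}}\, \mathbf{P}(0 < \mathcal{N}(0, 1) \le y)\, \tilde{x}\, dy + 2\Big( 1.07\overline{\lambda}\varepsilon + 42.45\, \frac{\varepsilon}{\sigma} \Big) \\
&= \int_0^\infty e^{-y\tilde{x}}\, d\Phi(y) + 2.14\overline{\lambda}\varepsilon + 84.9\, \frac{\varepsilon}{\sigma} \\
&= M(\tilde{x}) + 2.14\overline{\lambda}\varepsilon + 84.9\, \frac{\varepsilon}{\sigma}.
\end{aligned} \tag{4.6}
\]
Since $\int_0^\infty e^{-t}\, \mathbf{P}_{\overline{\lambda}}(0 < U_n(\overline{\lambda}) \le t)\, dt \le 1$ and $\exp\{-\frac{t^2}{2}\} \le \sqrt{2\pi}\,(1 + t)\,(1 - \Phi(t))$ for $t \ge 0$ (cf. (2.17)), combining
(4.5) and (4.6), we deduce, for all $x \ge 0$,
\begin{align}
\mathbf{P}(S_n > x\sigma) &\le e^{-\frac{1}{2}(1 + (1 - 2\beta)\overline{\lambda}\varepsilon)\tilde{x}^2}\, \mathbf{1}_{\{\overline{\lambda}\varepsilon > 0.1\}} \notag \\
&\quad + e^{-\frac{1}{2}(1 - 2\beta)\overline{\lambda}\varepsilon\tilde{x}^2} \Big[ 1 - \Phi(\tilde{x}) + e^{-\frac{1}{2}\tilde{x}^2} \Big( 2.14\overline{\lambda}\varepsilon + 84.9\, \frac{\varepsilon}{\sigma} \Big) \Big] \mathbf{1}_{\{\overline{\lambda}\varepsilon \le 0.1\}} \notag \\
&\le \big(1 - \Phi(\tilde{x})\big)(I_{11} + I_{12}), \tag{4.7}
\end{align}
with
\[
I_{11} = \exp\Big\{ -\frac{1}{2}(1 - 2\beta)\overline{\lambda}\varepsilon\tilde{x}^2 \Big\} \sqrt{2\pi}\,(1 + \tilde{x})\, \mathbf{1}_{\{\overline{\lambda}\varepsilon > 0.1\}} \tag{4.8}
\]
and
\[
I_{12} = \exp\Big\{ -\frac{1}{2}(1 - 2\beta)\overline{\lambda}\varepsilon\tilde{x}^2 \Big\} \Big[ 1 + \sqrt{2\pi}\,(1 + \tilde{x}) \Big( 2.14\overline{\lambda}\varepsilon + 84.9\, \frac{\varepsilon}{\sigma} \Big) \Big] \mathbf{1}_{\{\overline{\lambda}\varepsilon \le 0.1\}}.
\]
Now we shall give estimates for $I_{11}$ and $I_{12}$. If $\overline{\lambda}\varepsilon > 0.1$, then $I_{12} = 0$ and
\[
I_{11} \le \exp\Big\{ -0.1(1 - 2\beta)\frac{\tilde{x}^2}{2} \Big\} \sqrt{2\pi}\,(1 + \tilde{x}). \tag{4.9}
\]
By a simple calculation, $I_{11} \le 1$ provided that $\tilde{x} \ge \frac{8}{1 - 2\beta}$ (note that $\beta \in [0, 0.5)$). For $0 \le \tilde{x} < \frac{8}{1 - 2\beta}$,
we get $\overline{\lambda}\sigma = \tilde{x}(1 - \overline{\lambda}\varepsilon) < \frac{8}{1 - 2\beta}(1 - 0.1) = \frac{7.2}{1 - 2\beta}$. Then, using $10\overline{\lambda}\varepsilon > 1$, we obtain
\begin{align*}
I_{11} &\le 1 + 10\sqrt{2\pi}\,(1 + \tilde{x})\, \overline{\lambda}\sigma\, \frac{\varepsilon}{\sigma} \\
&\le 1 + \frac{72\sqrt{2\pi}}{1 - 2\beta}\,(1 + \tilde{x})\, \frac{\varepsilon}{\sigma} \\
&\le 1 + \frac{180.48}{1 - 2\beta}\,(1 + \tilde{x})\, \frac{\varepsilon}{\sigma}.
\end{align*}
If $0 \le \overline{\lambda}\varepsilon \le 0.1$, we have $I_{11} = 0$. Since
\[
1 + \sqrt{2\pi}\,(1 + \tilde{x}) \Big( 2.14\overline{\lambda}\varepsilon + 84.9\, \frac{\varepsilon}{\sigma} \Big) \le \Big( 1 + 2.14\sqrt{2\pi}\,(1 + \tilde{x})\overline{\lambda}\varepsilon \Big) \Big( 1 + 84.9\sqrt{2\pi}\,(1 + \tilde{x})\, \frac{\varepsilon}{\sigma} \Big) =: J_1 J_2,
\]
it follows that $I_{12} \le \exp\{-\frac{1}{2}(1 - 2\beta)\overline{\lambda}\varepsilon\tilde{x}^2\}\, J_1 J_2$. Using the inequality $1 + x \le e^x$, we deduce
\[
I_{12} \le \exp\Big\{ -\overline{\lambda}\varepsilon \Big( (1 - 2\beta)\frac{\tilde{x}^2}{2} - 2.14\sqrt{2\pi}\,(1 + \tilde{x}) \Big) \Big\} J_2.
\]
If $\tilde{x} \ge \frac{11.65}{1 - 2\beta}$, we see that $\frac{1}{2}(1 - 2\beta)\tilde{x}^2 - 2.14\sqrt{2\pi}\,(1 + \tilde{x}) \ge 0$, so $I_{12} \le J_2$. For $0 \le \tilde{x} < \frac{11.65}{1 - 2\beta}$, we
get $\overline{\lambda}\sigma = \tilde{x}(1 - \overline{\lambda}\varepsilon) < \frac{11.65}{1 - 2\beta}$. Then
\begin{align*}
I_{12} &\le 1 + \sqrt{2\pi}\,(1 + \tilde{x}) \Big( 2.14\overline{\lambda}\varepsilon + 84.9\, \frac{\varepsilon}{\sigma} \Big) \\
&= 1 + \sqrt{2\pi}\,(1 + \tilde{x}) \big( 2.14\overline{\lambda}\sigma + 84.9 \big) \frac{\varepsilon}{\sigma} \\
&\le 1 + \sqrt{2\pi}\,(1 + \tilde{x}) \Big( \frac{2.14 \times 11.65}{1 - 2\beta} + 84.9 \Big) \frac{\varepsilon}{\sigma} \\
&\le 1 + \Big( \frac{62.493}{1 - 2\beta} + 212.813 \Big) (1 + \tilde{x})\, \frac{\varepsilon}{\sigma}.
\end{align*}
Hence, whenever $0 \le \overline{\lambda}\varepsilon < 1$, we have
\[
I_{11} + I_{12} \le 1 + \Big( \Big( \frac{62.493}{1 - 2\beta} + 212.813 \Big) \vee \frac{180.48}{1 - 2\beta} \Big) (1 + \tilde{x})\, \frac{\varepsilon}{\sigma}. \tag{4.10}
\]
Therefore, substituting $\overline{\lambda}$ from (4.4) in the expression of $\tilde{x} = \frac{\overline{\lambda}\sigma}{1 - \overline{\lambda}\varepsilon}$ and replacing $1 - 2\beta$ by $\delta$, we
obtain, from (4.7) and (4.10), inequality (2.1) in Theorem 2.1.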

4.2. Proof of Theorem 2.2 

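For any $x \ge 0$, let $\overline{\lambda} = \overline{\lambda}(x) \in [0, \varepsilon^{-1})$ be the unique solution of the equation
\[
\frac{\lambda - 0.5\lambda^2\varepsilon}{(1 - \lambda\varepsilon)^2} = \frac{x}{\sigma}. \tag{4.11}
\]
By Lemma 3.1, it follows that
\[
\overline{\lambda} = \frac{2x/\sigma}{1 + 2x\varepsilon/\sigma + \sqrt{1 + 2x\varepsilon/\sigma}} \quad \text{and} \quad B_n(\overline{\lambda}) \le x\sigma. \tag{4.12}
\]
Employing (4.3) with $\lambda = \overline{\lambda}$, we get
\[
\mathbf{P}(S_n > x\sigma) \le \exp\Big\{ -\frac{\hat{x}^2}{2} \Big\} \int_0^\infty e^{-t}\, \mathbf{P}_{\overline{\lambda}}(0 < U_n(\overline{\lambda}) \le t)\, dt, \tag{4.13}
\]
where
\[
\hat{x} = \frac{\overline{\lambda}\sigma}{1 - \overline{\lambda}\varepsilon}.
\]
Using Lemma 3.5 and $B_n(\overline{\lambda}) \le x\sigma$ (cf. (4.12)), we have
\[
\begin{aligned}
\int_0^\infty e^{-t}\, \mathbf{P}_{\overline{\lambda}}(0 < U_n(\overline{\lambda}) \le t)\, dt &= \int_0^\infty e^{-y\hat{x}}\, \mathbf{P}_{\overline{\lambda}}(0 < U_n(\overline{\lambda}) \le y\hat{x})\, \hat{x}\, dy \\
&\le \int_0^\infty e^{-y\hat{x}}\, \mathbf{P}(0 < \mathcal{N}(0, 1) \le y)\, \hat{x}\, dy \\
&\quad + 2\Big( 1.07\overline{\lambda}\varepsilon + 42.45\, \frac{\varepsilon}{\sigma} \Big) \mathbf{1}_{\{0 \le \overline{\lambda} < 0.1\varepsilon^{-1}\}} + \mathbf{1}_{\{\overline{\lambda} \ge 0.1\varepsilon^{-1}\}} \\
&\le \int_0^\infty e^{-y\hat{x}}\, d\Phi(y) + \Big( 10\overline{\lambda}\varepsilon + 84.9\, \frac{\varepsilon}{\sigma} \Big) \wedge 1 \\
&= M(\hat{x}) + \Big( 10\overline{\lambda}\varepsilon + 84.9\, \frac{\varepsilon}{\sigma} \Big) \wedge 1.
\end{aligned} \tag{4.14}
\]
Combining (4.13) and (4.14), we obtain, for all $x \ge 0$,
\begin{align*}
\mathbf{P}(S_n > x\sigma) &\le 1 - \Phi(\hat{x}) + \exp\Big\{ -\frac{\hat{x}^2}{2} \Big\} \Big[ \Big( 10\overline{\lambda}\varepsilon + 84.9\, \frac{\varepsilon}{\sigma} \Big) \wedge 1 \Big] \\
&= \big(1 - \Phi(\hat{x})\big) \Big[ 1 + \frac{1}{M(\hat{x})} \Big( \Big( 10\overline{\lambda}\varepsilon + 84.9\, \frac{\varepsilon}{\sigma} \Big) \wedge 1 \Big) \Big].
\end{align*}
Substituting $\overline{\lambda}$ from (4.12) in the expression of $\hat{x} = \frac{\overline{\lambda}\sigma}{1 - \overline{\lambda}\varepsilon}$, we get, for all $x \ge 0$,
\[
\mathbf{P}(S_n > x\sigma) \le \big(1 - \Phi(\hat{x})\big) \Big[ 1 + A\Big(x, \frac{\varepsilon}{\sigma}\Big) \frac{\varepsilon}{\sigma} \Big], \tag{4.15}
\]
where
\[
\hat{x} = \frac{2x}{1 + \sqrt{1 + 2x\varepsilon/\sigma}}
\]
and
\[
A\Big(x, \frac{\varepsilon}{\sigma}\Big) = \frac{1}{M(\hat{x})} \Big( \Big( \frac{10\,\hat{x}}{\sqrt{1 + 2x\varepsilon/\sigma}} + 84.9 \Big) \wedge \frac{\sigma}{\varepsilon} \Big).
\]
This completes the proof of Theorem 2.2.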

imsart-generic ver. 2011/05/20 file: The_missing_factor_IHP_02.tex date: March 8, 2013 



X. Fan, I. Grama et Q. Liu/ The missing factor in Bennett's inequality 

4.3. Proof of Theorem 2.3 

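Let $\overline{\lambda}$ be defined by (4.12). Using Lemma 3.4 and $B_n(\overline{\lambda}) \le x\sigma$, we have, for all $0 \le \overline{\lambda} < \varepsilon^{-1}$,
\[
\begin{aligned}
\int_0^\infty e^{-t}\, \mathbf{P}_{\overline{\lambda}}(0 < U_n(\overline{\lambda}) \le t)\, dt &= \int_0^\infty e^{-y\overline{\lambda}\,\overline{\sigma}(\overline{\lambda})}\, \mathbf{P}_{\overline{\lambda}}\big(0 < U_n(\overline{\lambda}) \le y\overline{\lambda}\,\overline{\sigma}(\overline{\lambda})\big)\, \overline{\lambda}\,\overline{\sigma}(\overline{\lambda})\, dy \\
&\le \int_0^\infty e^{-y\overline{\lambda}\,\overline{\sigma}(\overline{\lambda})}\, \mathbf{P}(0 < \mathcal{N}(0, 1) \le y)\, \overline{\lambda}\,\overline{\sigma}(\overline{\lambda})\, dy + 2 \cdot 13.44\, \frac{\sigma^2\varepsilon}{\overline{\sigma}^3(\overline{\lambda})(1 - \overline{\lambda}\varepsilon)^4} \\
&= \int_0^\infty e^{-y\overline{\lambda}\,\overline{\sigma}(\overline{\lambda})}\, d\Phi(y) + 26.88\, \frac{\sigma^2\varepsilon}{\overline{\sigma}^3(\overline{\lambda})(1 - \overline{\lambda}\varepsilon)^4} \\
&= M\big(\overline{\lambda}\,\overline{\sigma}(\overline{\lambda})\big) + 26.88\, \frac{\sigma^2\varepsilon}{\overline{\sigma}^3(\overline{\lambda})(1 - \overline{\lambda}\varepsilon)^4}.
\end{aligned} \tag{4.16}
\]
Using $\lambda = \overline{\lambda}$ and $\int_0^\infty e^{-t}\, \mathbf{P}_{\overline{\lambda}}(0 < U_n(\overline{\lambda}) \le t)\, dt \le 1$, from (4.2) and (4.16), we obtain
\[
\mathbf{P}(S_n > x\sigma) \le \exp\big\{ -\overline{\lambda}x\sigma + \Psi_n(\overline{\lambda}) \big\} \Big[ \Big( M\big(\overline{\lambda}\,\overline{\sigma}(\overline{\lambda})\big) + 26.88\, \frac{\sigma^2\varepsilon}{\overline{\sigma}^3(\overline{\lambda})(1 - \overline{\lambda}\varepsilon)^4} \Big) \wedge 1 \Big]. \tag{4.17}
\]
By Lemma 3.2, inequality (4.17) implies that
\[
\mathbf{P}(S_n > x\sigma) \le \exp\Big\{ -\overline{\lambda}x\sigma + n \log\Big( 1 + \frac{\overline{\lambda}^2\sigma^2}{2n(1 - \overline{\lambda}\varepsilon)} \Big) \Big\} \Big[ \Big( M\big(\overline{\lambda}\,\overline{\sigma}(\overline{\lambda})\big) + 26.88\, \frac{\sigma^2\varepsilon}{\overline{\sigma}^3(\overline{\lambda})(1 - \overline{\lambda}\varepsilon)^4} \Big) \wedge 1 \Big].
\]
Substituting $\overline{\lambda}$ from (4.12) in the previous exponential function, we get
\[
\mathbf{P}(S_n > x\sigma) \le B_n\Big(x, \frac{\varepsilon}{\sigma}\Big) \Big[ \Big( M\big(\overline{\lambda}\,\overline{\sigma}(\overline{\lambda})\big) + 26.88\, \frac{\sigma^2\varepsilon}{\overline{\sigma}^3(\overline{\lambda})(1 - \overline{\lambda}\varepsilon)^4} \Big) \wedge 1 \Big]. \tag{4.18}
\]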
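The substitution step leading to the factor $B_n(x, \frac{\varepsilon}{\sigma})$ is left implicit. One way to carry it out, using the shorthand $s = \sqrt{1 + 2x\varepsilon/\sigma}$ (introduced here only for this computation) and $\hat{x} = \frac{2x}{1+s}$ from (1.3), is the following sketch.

```latex
% From (4.12): \overline{\lambda} = \frac{2x/\sigma}{s^2 + s}, hence
\overline{\lambda}\sigma = \frac{2x}{s(1+s)} = \frac{\hat{x}}{s},
\qquad
1 - \overline{\lambda}\varepsilon = 1 - \frac{s^2 - 1}{s(1+s)} = \frac{1}{s},
\qquad
\frac{\overline{\lambda}^2\sigma^2}{2(1 - \overline{\lambda}\varepsilon)} = \frac{\hat{x}^2}{2s}.

% Since x = \hat{x}(1+s)/2, the linear term is
\overline{\lambda}\,x\sigma = \frac{\hat{x}}{s}\cdot\frac{\hat{x}(1+s)}{2}
  = \frac{\hat{x}^2}{2s} + \frac{\hat{x}^2}{2}.

% Therefore, with \psi(t) = t - \log(1+t),
-\overline{\lambda}\,x\sigma + n\log\Bigl(1 + \frac{\hat{x}^2}{2ns}\Bigr)
  = -\frac{\hat{x}^2}{2}
    - n\Bigl[\frac{\hat{x}^2}{2ns} - \log\Bigl(1 + \frac{\hat{x}^2}{2ns}\Bigr)\Bigr]
  = -\frac{\hat{x}^2}{2} - n\,\psi\Bigl(\frac{\hat{x}^2}{2n\sqrt{1 + 2x\varepsilon/\sigma}}\Bigr).
```

Exponentiating the last line gives $B(x, \frac{\varepsilon}{\sigma}) \exp\{-n\psi(\cdot)\}$, which is the definition of $B_n(x, \frac{\varepsilon}{\sigma})$ stated before Theorem 2.3.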



Since M(t) is decreasing in t > and |M'(i)| < , £ > 0, it follows that 
M (Act(A)) - M(a;) < -= (a; - Aer(A)) + . 

v ; ^aV(a) v 7 

Using Lemma 3.3, we deduce 

M (Aa(A)) - M(x) 



< 



< 



< 



1 Ac 



l-0.5Ae (l-AeW(l-3Ae) 



1 - Ae + 6A 2 e 2 



(1 - 0.5Ae)(l - Ae + 6A""e 2 ) - (1 - Ae^l - 3Ae) 
v^F Ae(l - Ae) 4 (l - 3Ae)+/(l - Ae + 6A 2 e 2 ) 



e 
a 



l.lli?(Ae 



(4.16) 



(4.17) 



(4.18) 



(4.19) 
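By Lemma 3.3, it is easy to see that
\[
26.88\, \frac{\sigma^2\varepsilon}{\overline{\sigma}^3(\overline{\lambda})(1 - \overline{\lambda}\varepsilon)^4} \le 26.88\, R(\overline{\lambda}\varepsilon)\, \frac{\varepsilon}{\sigma}. \tag{4.20}
\]
Hence,
\[
M\big(\overline{\lambda}\,\overline{\sigma}(\overline{\lambda})\big) + 26.88\, \frac{\sigma^2\varepsilon}{\overline{\sigma}^3(\overline{\lambda})(1 - \overline{\lambda}\varepsilon)^4} \le M(x) + 27.99\, R(\overline{\lambda}\varepsilon)\, \frac{\varepsilon}{\sigma}. \tag{4.21}
\]
Implementing (4.21) into (4.18) and using $\overline{\lambda}\varepsilon \le x\frac{\varepsilon}{\sigma}$, we obtain inequality (2.7).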
5. Proof of Theorem 2.4 

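In this section, we give a lower bound for $\mathbf{P}(S_n > x\sigma)$. From Lemma 3.2 and (4.1), it follows that,
for all $0 \le \lambda < \varepsilon^{-1}$,
\[
\mathbf{P}(S_n > x\sigma) \ge \exp\Big\{ -\frac{\lambda^2\sigma^2}{2(1 - \lambda\varepsilon)^6} \Big\}\, \mathbf{E}_\lambda e^{-\lambda Y_n(\lambda)} \mathbf{1}_{\{Y_n(\lambda) + B_n(\lambda) - x\sigma > 0\}}.
\]
Let $\overline{\lambda} = \overline{\lambda}(x) \in [0, \varepsilon^{-1}/4.8]$ be the unique solution of the equation
\[
\lambda(1 - 2.4\lambda\varepsilon)\sigma^2 = x\sigma. \tag{5.1}
\]
This definition and Lemma 3.1 imply that, for all $0 \le x \le \sigma/(9.6\varepsilon)$,
\[
\overline{\lambda} = \frac{2x/\sigma}{1 + \sqrt{1 - 9.6x\varepsilon/\sigma}} \quad \text{and} \quad x\sigma \le B_n(\overline{\lambda}). \tag{5.2}
\]
Therefore,
\[
\mathbf{P}(S_n > x\sigma) \ge \exp\Big\{ -\frac{\overline{\lambda}^2\sigma^2}{2(1 - \overline{\lambda}\varepsilon)^6} \Big\}\, \mathbf{E}_{\overline{\lambda}} e^{-\overline{\lambda}Y_n(\overline{\lambda})} \mathbf{1}_{\{Y_n(\overline{\lambda}) > 0\}}.
\]
Setting $V_n(\overline{\lambda}) = \overline{\lambda}Y_n(\overline{\lambda})$, we get
\[
\mathbf{P}(S_n > x\sigma) \ge \exp\Big\{ -\frac{\check{x}^2}{2} \Big\} \int_0^\infty e^{-t}\, \mathbf{P}_{\overline{\lambda}}(0 < V_n(\overline{\lambda}) \le t)\, dt, \tag{5.3}
\]
where $\check{x} = \frac{\overline{\lambda}\sigma}{(1 - \overline{\lambda}\varepsilon)^3}$. By Lemma 3.4, it is easy to see that
\begin{align*}
\int_0^\infty e^{-t}\, \mathbf{P}_{\overline{\lambda}}(0 < V_n(\overline{\lambda}) \le t)\, dt &= \int_0^\infty e^{-\overline{\lambda}y\overline{\sigma}(\overline{\lambda})}\, \mathbf{P}_{\overline{\lambda}}\big(0 < V_n(\overline{\lambda}) \le \overline{\lambda}y\overline{\sigma}(\overline{\lambda})\big)\, \overline{\lambda}\,\overline{\sigma}(\overline{\lambda})\, dy \\
&\ge \int_0^\infty e^{-\overline{\lambda}y\overline{\sigma}(\overline{\lambda})}\, \mathbf{P}(0 < \mathcal{N}(0, 1) \le y)\, \overline{\lambda}\,\overline{\sigma}(\overline{\lambda})\, dy - 2 \cdot 13.44\, \frac{\sigma^2\varepsilon}{\overline{\sigma}^3(\overline{\lambda})(1 - \overline{\lambda}\varepsilon)^4} \\
&= \int_0^\infty e^{-\overline{\lambda}y\overline{\sigma}(\overline{\lambda})}\, d\Phi(y) - 26.88\, \frac{\sigma^2\varepsilon}{\overline{\sigma}^3(\overline{\lambda})(1 - \overline{\lambda}\varepsilon)^4} \\
&= M\big(\overline{\lambda}\,\overline{\sigma}(\overline{\lambda})\big) - 26.88\, \frac{\sigma^2\varepsilon}{\overline{\sigma}^3(\overline{\lambda})(1 - \overline{\lambda}\varepsilon)^4}.
\end{align*}
Since $M(t)$ is decreasing in $t \ge 0$ and $\overline{\sigma}(\overline{\lambda}) \le \frac{\sigma}{(1 - \overline{\lambda}\varepsilon)^3}$ (cf. Lemma 3.3), it follows that
\[
\int_0^\infty e^{-t}\, \mathbf{P}_{\overline{\lambda}}(0 < V_n(\overline{\lambda}) \le t)\, dt \ge M(\check{x}) - 26.88\, \frac{\sigma^2\varepsilon}{\overline{\sigma}^3(\overline{\lambda})(1 - \overline{\lambda}\varepsilon)^4}.
\]
Returning to (5.3), we obtain
\[
\mathbf{P}(S_n > x\sigma) \ge 1 - \Phi(\check{x}) - 26.88\, \frac{\sigma^2\varepsilon}{\overline{\sigma}^3(\overline{\lambda})(1 - \overline{\lambda}\varepsilon)^4} \exp\Big\{ -\frac{\check{x}^2}{2} \Big\}.
\]
Using Lemma 3.3, for all $0 \le x \le \sigma/(9.6\varepsilon)$, we have $0 \le \overline{\lambda}\varepsilon \le 1/4.8$ and
\[
\overline{\sigma}^3(\overline{\lambda})(1 - \overline{\lambda}\varepsilon)^4 \ge \sigma^3\, \frac{(1 - \overline{\lambda}\varepsilon)^7 (1 - 3\overline{\lambda}\varepsilon)^{3/2}}{(1 - \overline{\lambda}\varepsilon + 6\overline{\lambda}^2\varepsilon^2)^3}.
\]
Therefore, for all $0 \le x \le \sigma/(9.6\varepsilon)$,
\[
\mathbf{P}(S_n > x\sigma) \ge 1 - \Phi(\check{x}) - 26.88\, R(\overline{\lambda}\varepsilon)\, \frac{\varepsilon}{\sigma} \exp\Big\{ -\frac{\check{x}^2}{2} \Big\}.
\]
Using the inequality $\exp\{-\frac{t^2}{2}\} \le \sqrt{2\pi}\,(1 + t)\,(1 - \Phi(t))$ for $t \ge 0$, we get, for all $0 \le x \le \sigma/(9.6\varepsilon)$,
\[
\mathbf{P}(S_n > x\sigma) \ge \big(1 - \Phi(\check{x})\big) \Big[ 1 - 67.38\, R(\overline{\lambda}\varepsilon)\, (1 + \check{x})\, \frac{\varepsilon}{\sigma} \Big].
\]
In particular, for all $0 \le x \le \alpha\sigma/\varepsilon$ with $0 \le \alpha \le 1/9.6$, a simple calculation shows that
\[
0 \le \overline{\lambda}\varepsilon \le \frac{2\alpha}{1 + \sqrt{1 - 9.6\alpha}} \le \frac{1}{4.8}
\]
and
\[
67.38\, R(\overline{\lambda}\varepsilon) \le 67.38\, R\Big( \frac{2\alpha}{1 + \sqrt{1 - 9.6\alpha}} \Big) \le 67.38\, R\Big( \frac{1}{4.8} \Big) \le 1753.23.
\]
This completes the proof of Theorem 2.4.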
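The numerical constants at the end of this proof can be reproduced with a few lines of code. The sketch below is only a sanity check written for this purpose; `R` follows (2.10), while the name `c_alpha` and the guard against a tiny negative square-root argument are choices made here.

```python
import math


def R(t: float) -> float:
    """R(t) of (2.10): finite on [0, 1/3), infinite for t >= 1/3."""
    if t >= 1.0 / 3.0:
        return math.inf
    return (1.0 - t + 6.0 * t * t) ** 3 / ((1.0 - 3.0 * t) ** 1.5 * (1.0 - t) ** 7)


def c_alpha(alpha: float) -> float:
    """67.38 * R(2*alpha / (1 + sqrt(1 - 9.6*alpha))) for 0 <= alpha <= 1/9.6."""
    # max(..., 0.0) guards against rounding when alpha is exactly 1/9.6.
    root = math.sqrt(max(1.0 - 9.6 * alpha, 0.0))
    return 67.38 * R(2.0 * alpha / (1.0 + root))


if __name__ == "__main__":
    print(c_alpha(0.1))          # compare with the constant 682.89
    print(c_alpha(1.0 / 9.6))    # compare with the constant 1753.23
    print(26.88 * math.sqrt(2.0 * math.pi))   # origin of the factor 67.38
```

The last line recalls where 67.38 comes from: it is $26.88\sqrt{2\pi}$ rounded up, produced by the inequality $e^{-t^2/2} \le \sqrt{2\pi}(1+t)(1-\Phi(t))$.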

6. Proof of Theorem 2.5 

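We will use (4.2). Notice that $\Psi_n'(\lambda) \in [0, \infty)$ is increasing in $\lambda \ge 0$. Let $\overline{\lambda} = \overline{\lambda}(x) \ge 0$ be the
unique solution of the equation $x\sigma = \Psi_n'(\lambda)$. This definition implies that $B_n(\overline{\lambda}) = \Psi_n'(\overline{\lambda}) = x\sigma$,
$U_n(\overline{\lambda}) = \overline{\lambda}Y_n(\overline{\lambda})$ and
\[
e^{-\overline{\lambda}x\sigma + \Psi_n(\overline{\lambda})} = \inf_{\lambda \ge 0} e^{-\lambda x\sigma + \Psi_n(\lambda)} = \inf_{\lambda \ge 0} \mathbf{E}e^{\lambda(S_n - x\sigma)}.
\]
Using Lemma 3.4 with $\lambda = \overline{\lambda}$, we have
\begin{align*}
\int_0^\infty e^{-t}\, \mathbf{P}_{\overline{\lambda}}(0 < U_n(\overline{\lambda}) \le t)\, dt &= \int_0^\infty e^{-y\overline{\lambda}\,\overline{\sigma}(\overline{\lambda})}\, \mathbf{P}_{\overline{\lambda}}\big(0 < Y_n(\overline{\lambda}) \le y\overline{\sigma}(\overline{\lambda})\big)\, \overline{\lambda}\,\overline{\sigma}(\overline{\lambda})\, dy \\
&= \int_0^\infty e^{-y\overline{\lambda}\,\overline{\sigma}(\overline{\lambda})}\, \mathbf{P}(0 < \mathcal{N}(0, 1) \le y)\, \overline{\lambda}\,\overline{\sigma}(\overline{\lambda})\, dy + \frac{26.88\, \theta_1 \sigma^2\varepsilon}{\overline{\sigma}^3(\overline{\lambda})(1 - \overline{\lambda}\varepsilon)^4} \\
&= \int_0^\infty e^{-y\overline{\lambda}\,\overline{\sigma}(\overline{\lambda})}\, d\Phi(y) + \frac{26.88\, \theta_1 \sigma^2\varepsilon}{\overline{\sigma}^3(\overline{\lambda})(1 - \overline{\lambda}\varepsilon)^4} \\
&= M\big(\overline{\lambda}\,\overline{\sigma}(\overline{\lambda})\big) + \frac{26.88\, \theta_1 \sigma^2\varepsilon}{\overline{\sigma}^3(\overline{\lambda})(1 - \overline{\lambda}\varepsilon)^4},
\end{align*}
where $|\theta_1| \le 1$. Therefore by (4.2), we obtain
\[
\mathbf{P}(S_n > x\sigma) = \inf_{\lambda \ge 0} \mathbf{E}e^{\lambda(S_n - x\sigma)} \Big[ M\big(\overline{\lambda}\,\overline{\sigma}(\overline{\lambda})\big) + \frac{26.88\, \theta_1 \sigma^2\varepsilon}{\overline{\sigma}^3(\overline{\lambda})(1 - \overline{\lambda}\varepsilon)^4} \Big]. \tag{6.2}
\]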

Since M(t) is decreasing in t > and |M'(t)| < 1 1 > 0, it follows that 

\m (act(a)) — m(x)\ < -L k I _ (63) 

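By Lemma 3.1, we have the following two-sided bound
\[
\frac{(1 - 1.5\overline{\lambda}\varepsilon)(1 - \overline{\lambda}\varepsilon)}{1 - \overline{\lambda}\varepsilon + 6\overline{\lambda}^2\varepsilon^2}\, \overline{\lambda}\sigma \le \frac{B_n(\overline{\lambda})}{\sigma} = x \le \frac{1 - 0.5\overline{\lambda}\varepsilon}{(1 - \overline{\lambda}\varepsilon)^2}\, \overline{\lambda}\sigma. \tag{6.4}
\]
Using the two-sided bound in Lemma 3.3 and (6.4), by a simple calculation, we deduce
\[
\overline{\lambda}^2\overline{\sigma}^2(\overline{\lambda}) \wedge x^2 \ge \frac{(1 - \overline{\lambda}\varepsilon)^2 (1 - 3\overline{\lambda}\varepsilon)}{(1 - \overline{\lambda}\varepsilon + 6\overline{\lambda}^2\varepsilon^2)^2}\, \overline{\lambda}^2\sigma^2 \tag{6.5}
\]
and
\[
\big| x - \overline{\lambda}\,\overline{\sigma}(\overline{\lambda}) \big| \le \overline{\lambda}\sigma\, \Big| \frac{1 - 0.5\overline{\lambda}\varepsilon}{(1 - \overline{\lambda}\varepsilon)^2} - \frac{(1 - \overline{\lambda}\varepsilon)\sqrt{1 - 3\overline{\lambda}\varepsilon}}{1 - \overline{\lambda}\varepsilon + 6\overline{\lambda}^2\varepsilon^2} \Big|. \tag{6.6}
\]
From (6.3), (6.5), (6.6) and Lemma 3.3, we easily obtain
\[
\big| M\big(\overline{\lambda}\,\overline{\sigma}(\overline{\lambda})\big) - M(x) \big| \le 1.11\, R(\overline{\lambda}\varepsilon)\, \frac{\varepsilon}{\sigma}. \tag{6.7}
\]
By Lemma 3.3, it is easy to see that
\[
\frac{26.88\, \sigma^2\varepsilon}{\overline{\sigma}^3(\overline{\lambda})(1 - \overline{\lambda}\varepsilon)^4} \le 26.88\, R(\overline{\lambda}\varepsilon)\, \frac{\varepsilon}{\sigma}. \tag{6.8}
\]
Combining (6.7) and (6.8), we get, for all $0 \le \overline{\lambda} < \frac{1}{3}\varepsilon^{-1}$,
\[
M\big(\overline{\lambda}\,\overline{\sigma}(\overline{\lambda})\big) + \frac{26.88\, \theta_1 \sigma^2\varepsilon}{\overline{\sigma}^3(\overline{\lambda})(1 - \overline{\lambda}\varepsilon)^4} = M(x) + 27.99\, \theta_2 R(\overline{\lambda}\varepsilon)\, \frac{\varepsilon}{\sigma}, \tag{6.9}
\]
where $|\theta_2| \le 1$. Implementing (6.9) into (6.2) and using $\overline{\lambda}\varepsilon \le x\frac{\varepsilon}{\sigma}$, we obtain equality (2.14) of
Theorem 2.5.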
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